Raft Part I - Leader Election

Resource:

I just spent four days completing the first checkpoint for the Raft consensus algorithm, so here’s a summary. Raft will be divided into two parts. In the first part, I'll implement leader election in Go and explain the core test design. It’s quite fitting to work on a democratic election project right as the U.S. 2024 election approaches, haha.

Core Ideas

First, we need three states for the server: leader, follower, and candidate. In the article:

As part of program design, I’ve made this section more concrete by defining it in terms of several channels within the run() function, which handle signals as follows:

Implementation Details

run() function

The run() function is launched by a goroutine when initializing a server, and it acts as the central handler for various inputs.

followerClock() function

This function manages the timer for a Raft node in the follower state, detecting the leader’s heartbeats. If no heartbeat is received within a specified timeout, the node transitions to the candidate state and initiates a new election process.

election() function

After starting the election() function:

  1. Set identity to Candidate.
  1. Increment currentTerm to update the term.
  1. Set voteFor = me to vote for itself.

Then launch three functions:

go rf.electionClock(stopElectionClockCh)
go rf.elecParallelSend(stopElecParallelSend)
go rf.electionRecv(stopElectionRecv)

Afterward, handle signals:

elecParallelSend() and electionRecv() functions

When sending voteRequest, the article mentions it should be sent in parallel. Here we ensure concurrency, so the logic is correct. Use:

go rf.sendRequestVote(i, &RequestVoteArgs{
	Term:         curTerm,
	CandidateId:  rf.me,
	LastLogIndex: logLen - 1,
	LastLogTerm:  curTerm,
}, reply)

Send the RPC, and remember to lock when accessing the state variable; otherwise, it might conflict with other operations modifying state.

Upon receiving the RPC response, immediately return, so we set up a buffered channel to collect each voter’s result:

if ok && reply.VoteGranted {rf.voteIn <- true}

Then, collect the votes in the electionRecv() function with vote := <-rf.voteIn.

electionClock() function

The electionClock function is simpler than followerClock, needing only to check if the countdown finishes first or if an external signal ends the goroutine first.

RequestVote()

This is critical!!! As the RPC handler function, its correct implementation determines whether the election proceeds smoothly. Lock and unlock the entire function to ensure state consistency before and after voting. Three scenarios apply:

  1. args.Term > rf.state.currentTerm
    • Vote: reply.VoteGranted = true
    • Update Term: rf.state.currentTerm = args.Term
    • Record the vote for this candidate: rf.state.votedFor = args.CandidateId
    • Then, consider the following:
      • If this server is a Candidate:
        • Change to Follower and start rf.followerClock().
        • Stop the election rf.closeElectionCh <- struct{}{}
      • If this server is a Leader:
        • Change to Follower and start rf.followerClock().
        • Stop the heartbeat rf.closeHeartbeatClock <- struct{}{}
      • If this server is a Follower:
        • Reset the timer rf.resetFollowerClock <- struct{}{}
  1. args.Term == rf.state.currentTerm
    • Do not vote. Why? Even as a follower? Yes, because if they were the leader for this term, they should send a heartbeat rather than a vote request. They’re in the same term as you but still running an election? To be valid, they should be at least one term ahead!
  1. args.Term < rf.state.currentTerm
    • Refuse the vote and inform them of the latest Term: reply.Term = rf.state.currentTerm.

heartbeat(), sendHeartbeat() functions

Straightforward, each countdown resets and initiates a new countdown, starting a goroutine go sendHeartbeat().

sendHeartbeat() is simple, issuing RPC to all servers with go sendAppendEntries().

AppendEntries() function

Also crucial!!! Similar to RequestVote, it has three main cases. Lock at the beginning and unlock at the end.

Send the heartbeat Term to run for further processing: rf.heartbeatIn <- args.Term.

Test Design

TestInitialElection2A

TestReElection2A

TestReElectionHidden2A

TestSmallPartitionConsensusHidden2A

TestLeaderConsistencyHidden2A